Coloured Image Assessment

This invention relates to a method, an apparatus and a computer program for
assessment of coloured images: it is particularly (although not exclusively) relevant to
assessment of histological slides to provide clinical information on potentially cancerous
tissue such as breast cancer tissue.

Breast cancer is a common form of female cancer: once a lesion indicative of breast
cancer has been detected, tissue samples are taken and examined by a histopathologist
to establish a diagnosis, prognosis and treatment plan. However, pathological analysis of
tissue samples is a time consuming and inaccurate process. It entails interpretation of
colour images by human eye, which is highly subjective: it is characterised by
considerable inaccuracies in observations of the same samples by different observers
and even by the same observer at different times. For example, two different observers
assessing the same ten tissue samples may easily give different opinions for three of the
slides - 30% error. The problem is exacerbated by heterogeneity, i.e. complexity of some
tissue sample features. Moreover, there is a shortage of pathology staff.

There is a need to provide an objective measurement of Cerb-B2, ER and PR status and
vascularity to contribute to an effective treatment plan for the patient.

In order that the invention might be more fully understood, embodiments thereof will now
be described, by way of example only, with reference to the accompanying drawings, in
which:-

Figure 1 is a block diagram of a procedure for measuring indications of cancer to
assist in formulating diagnosis and treatment;

Figure 2 is a block diagram of a process for measuring Cerb-B2 in the procedure of
Figure 1;

Figure 3 is a block diagram of a process for measuring vascularity in the procedure of
Figure 1;

Figure 4 is a block diagram of a process for measuring oestrogen and progesterone
receptor in the procedure of Figure 1;

Figure 5 is a pseudo three dimensional view of a red, green and blue colour space
(colour cube) plotted on respective orthogonal axes;

Figure 6 is a transformation of Figure 5 to form a chromaticity space;

Figure 7 is a drawing of a chromaticity space reference system; and

Figure 8 illustrates use of polar co-ordinates.

The examples of invention to be described herein are three different inventions which
can be implemented separately or together, because they are all measurements which
individually or collectively assist a clinician to diagnose cancer and to formulate a
treatment programme. In descending order of importance, the procedures are
determination of oestrogen and progesterone receptor, determination of Cerb-B2 and
determination of vascularity. For convenience however, they are described in the order
determination of Cerb-B2, determination of vascularity and determination of oestrogen
and progesterone receptor.

A procedure 10 for the assessment of tissue samples in the form of histopathological
slides of potential carcinomas of the breast is shown in Figure 1. This drawing illustrates
processes which generate measurements of specialised kinds for use by a pathologist
as the basis for assessing patient diagnosis, prognosis and treatment plan.

The procedure 10 employs a database, which maintains digitised image data obtained
from histological slides as will be described later. Sections are taken (cut) from breast
tissue samples (biopsies) and placed on respective slides. Slides are stained using a
staining agent selected from the following depending on which parameter is to be
determined:

a) Immunohistochemical staining for Cerb-B2 with diaminobenzidine (DAB) as
substrate (chemical staining agent) - collectively "Cerb-DAB" - this is for
assessing Cerb-B2 gene amplification status;

b) Oestrogen receptor (ER) with DAB as substrate (collectively "ER-DAB") for
assessing the expression (the amount expressed or emitted) of the oestrogen
receptors. Progesterone receptor (PR) status is investigated using chemical
treatment giving the same colouration as in ER.

c) Immunohistochemical staining for CD31 with fuchsin (F) as substrate for
assessing vascularity (angiogenesis).

In a prior art manual procedure, a clinician places a slide under a microscope and
examines a region of it (referred to as a tile) at magnification of x40 for indications of
Cerb-B2, ER and PR status and vascularity.

The present invention requires data from histological slides in a suitable form. In the
present example, image data were obtained by a pathologist using a Zeiss Axioskop
microscope with a Jenoptik Progres 3012 digital camera. Image data from each slide is
a set of digital images obtained at a linear magnification of 40 (i.e. 40X), each image
being an electronic equivalent of a tile.

To select images, a pathologist scans the microscope over a slide, and at 40X
magnification selects regions (tiles) of the slide which appear to be most promising in
terms of an analysis to be performed. Each of these regions is then photographed using
the microscope and digital camera referred to above, which produces for each region a
respective digitised image in three colours, i.e. red, green and blue (R, G & B). Three
intensity values are obtained for each pixel in a pixel array to provide an image as a
combination of R, G and B image planes. This image data is stored temporarily at 12 for
later use.

Two tiles are required for vascularity measurement at 14, and one tile for each of
oestrogen and progesterone receptor measurement at 16 and Cerb-B2 measurement at
18. These measurements provide input to a diagnostic report at 20.

The prior art manual procedure for scoring Cerb-B2 involves a pathologist subjectively
and separately estimating stain intensity, stain location and relative number of cells
associated with a feature of interest in a tissue sample. The values obtained in this way
are combined by a pathologist to give a single measurement for use in diagnosis,
prognosis and reaching a decision on treatment. The process hereinafter described in
this example replaces the prior art manual procedure with an objective procedure.

Referring now to Figure 2, a flow diagram of the Cerb-B2 measurement process 18 is
shown in more detail: the process 18 is applied to one image or tile obtained by
magnifying by a factor of 40 an area of a histological slide. The image is a three colour
or RGB image as defined above, i.e. there is a respective image plane for each colour.
For the purposes of the following analysis, the letters R, G and B for each pixel are
treated as the red, green and blue intensities at that pixel. The RGB input image is used
at 30 to compute a cyan image derived from the blue and green image planes: i.e. for
each pixel a cyan intensity C is computed from C = (2 x B + G)/3, the respective pixel's
green (G) intensity being added to twice its blue (B) intensity and the resulting sum being
divided by three. When repeated for all pixels this yields a cyan image or image plane.
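The cyan computation at step 30 can be sketched in a few lines of Python. This is a minimal illustration only: it assumes each image plane is held as a list of rows of numeric intensities, and the function name `cyan_plane` is hypothetical rather than taken from the specification.

```python
def cyan_plane(green, blue):
    """Return C = (2 * B + G) / 3 for every pixel of two equally sized planes."""
    return [
        [(2 * b + g) / 3 for g, b in zip(green_row, blue_row)]
        for green_row, blue_row in zip(green, blue)
    ]


# Small illustrative planes (values are made up for the example).
green = [[30, 60], [90, 120]]
blue = [[30, 0], [0, 240]]
cyan = cyan_plane(green, blue)
```

The red plane plays no part in this step, so it is not passed in.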

At 32, a Sobel edge filter is applied to the cyan image plane: this is a standard image
processing technique published in Klette R., & Zamperoni P., 'Handbook of image
processing operators', John Wiley & Sons, 1996. A Sobel edge filter consists of two 3x3
arrays of numbers Sp and Sq, each of which is convolved with successive 3x3 arrays of
pixels in an image. Here

$$S_p = \begin{pmatrix} 1 & 2 & 1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{pmatrix}, \qquad S_q = \begin{pmatrix} 1 & 0 & -1 \\ 2 & 0 & -2 \\ 1 & 0 & -1 \end{pmatrix} \tag{1}$$

The step 32 initially selects a first cyan 3x3 array of pixels in the top left hand corner of
the cyan image: designating as $C_{ij}$ a general cyan pixel in row i and column j, the top left
hand corner of the image consists of pixels $C_{11}$ to $C_{13}$, $C_{21}$ to $C_{23}$ and $C_{31}$ to $C_{33}$. $C_{ij}$ is
then multiplied by the respective digit of Sp located in the Sp array as $C_{ij}$ is in the 3x3
cyan pixel array: i.e. $C_{11}$ to $C_{13}$ are multiplied by 1, 2 and 1 respectively, $C_{21}$ to $C_{23}$ by
zeroes and $C_{31}$ to $C_{33}$ by -1, -2 and -1 respectively. The products so formed are added
algebraically and provide a value p.

The value of p will be relatively low for pixel values changing slowly between the first and
third rows either side of the row of $C_{22}$, and relatively high for pixel values changing
rapidly between those rows: in consequence p provides an indication of image edge
sharpness across rows. This procedure is repeated using the same pixel array but with
Sq replacing Sp, and a value q is obtained: q is relatively low for pixel values changing
slowly between the first and third columns either side of the column of $C_{22}$, and relatively
high for pixel values changing rapidly between those columns: q therefore provides
an indication of image edge sharpness across columns. The square root of the sum of
the squares of p and q is then computed, i.e. $\sqrt{p^2 + q^2}$, which is defined as an "edge
magnitude" and becomes $T_{22}$ (replacing pixel $C_{22}$ at the centre of the 3x3 array) in the
transformed cyan image. It is also possible to derive an edge "phase angle" as $\tan^{-1}(p/q)$,
but that is not required in the present example.

A general pixel $T_{ij}$ (row i, column j) in the transformed image is derived from $C_{i-1,j-1}$ to
$C_{i-1,j+1}$, $C_{i,j-1}$ to $C_{i,j+1}$ and $C_{i+1,j-1}$ to $C_{i+1,j+1}$ of the cyan image. Because the central row and
column of the Sobel filters in Equation (1) respectively are zeros, and other coefficients
are 1s and 2s, p and q for $T_{ij}$ can be calculated as follows:

$$p = \{C_{i-1,j-1} + 2C_{i-1,j} + C_{i-1,j+1}\} - \{C_{i+1,j-1} + 2C_{i+1,j} + C_{i+1,j+1}\} \tag{2}$$

$$q = \{C_{i-1,j-1} + 2C_{i,j-1} + C_{i+1,j-1}\} - \{C_{i-1,j+1} + 2C_{i,j+1} + C_{i+1,j+1}\} \tag{3}$$

Beginning with i = j = 2, p and q are calculated for successive 3x3 pixel arrays by
incrementing j by 1 and evaluating Equations (2) and (3) for each such array until the end
of a row is reached; i is then incremented by 1, j is reset to 2 and the procedure is
repeated for a second row and so on until the whole image has been transformed. This
transformed image is referred to below as the "Sobel of Cyan" image or image plane.

The Sobel filter cannot calculate values for pixels at image edges having no adjacent
pixels on one or other of its sides: i.e. in a pixel array having N rows and M columns,
edge pixels are the top and bottom rows and the first and last columns, or in the
transformed image pixels $T_{11}$ to $T_{1M}$, $T_{N1}$ to $T_{NM}$, $T_{11}$ to $T_{N1}$ and $T_{1M}$ to $T_{NM}$. By convention
in Sobel filtering these edge pixels are set to zero.
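The Sobel of Cyan transformation, including the zeroed border, might be sketched as follows. The sketch assumes the cyan plane is a list of rows; the function name `sobel_of_cyan` is hypothetical, and the sign convention chosen for q (left column minus right column) is an assumption that does not affect the edge magnitude.

```python
import math


def sobel_of_cyan(cyan):
    """Return the edge-magnitude image; border pixels are left at zero."""
    n_rows, n_cols = len(cyan), len(cyan[0])
    transformed = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(1, n_rows - 1):
        for j in range(1, n_cols - 1):
            # p: weighted row above minus weighted row below (weights 1, 2, 1).
            p = (cyan[i - 1][j - 1] + 2 * cyan[i - 1][j] + cyan[i - 1][j + 1]) - (
                cyan[i + 1][j - 1] + 2 * cyan[i + 1][j] + cyan[i + 1][j + 1]
            )
            # q: weighted left column minus weighted right column (assumed sign).
            q = (cyan[i - 1][j - 1] + 2 * cyan[i][j - 1] + cyan[i + 1][j - 1]) - (
                cyan[i - 1][j + 1] + 2 * cyan[i][j + 1] + cyan[i + 1][j + 1]
            )
            transformed[i][j] = math.hypot(p, q)  # sqrt(p^2 + q^2)
    return transformed


# A horizontal step edge: two bright rows above one dark row.
step_edge = [[10, 10, 10], [10, 10, 10], [0, 0, 0]]
edges = sobel_of_cyan(step_edge)
```

For the step edge above only the centre pixel can be evaluated, and its magnitude comes entirely from p because the columns are identical.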

The next step 34 is to compute the mean and standard deviation of the transformed pixel
values $T_{ij}$. For convenience a change of nomenclature is implemented: index k is
substituted for i and j, i.e. k = 1 to NM for i, j = 1, 1 to N, M: this treats a two dimensional
image as a single composite line composed of successive rows of the image. Also x is
substituted for T in each pixel value, so $T_{ij}$ becomes $x_k$. The following Equations (4) and
(5) respectively are used for computing the mean μ and standard deviation σ of the
transformed pixels $x_k$.




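$$\mu = \frac{1}{NM}\sum_{k=1}^{NM} x_k \tag{4}$$

$$\sigma = \sqrt{\frac{1}{NM-1}\sum_{k=1}^{NM}\left(x_k - \mu\right)^2} \tag{5}$$

At 36, various statistical parameters are computed for the Red, Green, Blue and Cyan
image planes using Equations (4) and (5) above and Equation (6) below.

$$\text{Skewness} = \frac{1}{NM}\sum_{k=1}^{NM}\left(\frac{x_k - \mu}{\sigma}\right)^3 \tag{6}$$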



For the Red image plane the statistical parameters are the mean μ, standard deviation σ
and skewness of its pixel values: in Equations (4) to (6) $x_k$ (k = 1 to NM) represents a
general pixel value in the Red image plane. In addition, the Red image plane's pixels are
compared with one another to obtain their maximum, minimum and range (maximum -
minimum). Similarly, pixels in each of the Green and Blue image planes are compared
with one another to obtain a respective maximum, minimum and range for each of these
planes. Finally, for the Cyan image, pixels' mean and standard deviation are computed
using Equations (4) and (5), in which $x_k$ represents a general pixel value in the Cyan
image plane.
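The per-plane statistics of steps 34 and 36 could be gathered as in the sketch below. It is illustrative only: the function name `plane_statistics` is hypothetical, and the exact normalisation of the standard deviation and skewness is an assumption (the sample form with NM - 1 for the standard deviation, and NM for the skewness).

```python
import math


def plane_statistics(plane):
    """Mean, standard deviation, skewness, minimum, maximum and range of a plane."""
    values = [x for row in plane for x in row]  # flatten rows: k = 1 .. NM
    count = len(values)
    mean = sum(values) / count
    # Assumption: sample standard deviation (divide by NM - 1).
    std = math.sqrt(sum((x - mean) ** 2 for x in values) / (count - 1))
    # Assumption: skewness is the mean cubed standardised deviation.
    skewness = sum(((x - mean) / std) ** 3 for x in values) / count if std > 0 else 0.0
    low, high = min(values), max(values)
    return {
        "mean": mean,
        "std": std,
        "skewness": skewness,
        "min": low,
        "max": high,
        "range": high - low,
    }


stats = plane_statistics([[1, 2], [3, 4]])
```

A plane needs at least two pixels for the standard deviation to be defined in this form.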

At 38, a colour segmentation is computed from the Red, Green, Blue, Cyan and Sobel of
Cyan image planes. Step 38 is shown in more detail within chain lines 40. Colour
segmentation is implemented by applying three sets of adaptive thresholding operations
referred to as D, E and F respectively. Each set of operations involves two respective
numerical factors: D has factors D1 = 0.802 and D2 = 1.24, E has E1 = E2 = 0.903, and
F has F1 = 1.24 and F2 = 0.802.

The first set of thresholding operations is D using factors D1 and D2. It implements the
following steps:

(a) Produce a thresholded image for the Red image plane as follows: for every Red
pixel value that is less than an adaptive threshold, set the corresponding pixel
location in the thresholded image to 1, otherwise set the latter to 0. A respective
adaptive threshold is computed separately for every pixel location as follows:

(i) Select the same pixel location in the cyan image and from it search in four
directions - the north, south, east and west directions - for seventy pixel
locations (or as many as are available up to seventy). Here north, south, east
and west have the following meanings: north: upwards from the pixel in the
same column; south: downwards from the pixel in the same column; east:
rightward from the pixel in the same row; and west: leftward from the pixel in
the same row. More directions could be used to improve accuracy but four
have been found to be adequate for the present process. If any of the
seventy locations in a respective direction has a pixel value less than the
product of D2 and the Cyan mean $\mu_C$ then the direction is assigned a value of
1, otherwise it is assigned a value of 0.

(ii) Sum all four direction values to provide a sum $S_D$ and set the adaptive
threshold for the current pixel location to whichever is the lesser of 255 and
(D1 x Red mean $\mu_R$) + ($S_D$ - 4) x Red standard deviation $\sigma_R$.

(b) Produce a thresholded image for the Cyan image plane as follows: using the
Cyan mean $\mu_C$ from step 34, for every Cyan pixel value that is less than (D2 x $\mu_C$)
set the pixel in the corresponding location in the thresholded image to 0, otherwise
set the latter to 1. This has the effect of removing excess brown pixels.

(c) Produce a thresholded image for the transformed or Sobel of Cyan image plane
as follows: using the Cyan mean $\mu_C$ and standard deviation $\sigma_C$ from step 34, for
every Cyan pixel value that is greater than ($\mu_C$ + 1.5$\sigma_C$) set the corresponding pixel
in the thresholded image to 0, otherwise set the latter to 1. This has the effect of
removing excess edge pixels.

(d) Using the pixel minimum and range values computed at step 36, a single
thresholded binary image is now produced using data obtained from the Red,
Green and Blue image planes. This is carried out as follows: for each Red, Green
and Blue pixel group at a respective pixel location that satisfies all three criteria at
(i) to (iii) below, set the pixel at the corresponding location in the thresholded image
to 0, otherwise set the latter to 1; this has the effect of removing lipid image regions
(regions of fat which appear as highly saturated white areas). The criteria for each
co-located Red, Green and Blue pixel group are:

(i) Red pixel > Red minimum + 0.98 x (red range), AND

(ii) Green pixel > Green minimum + 0.98 x (green range), AND

(iii) Blue pixel > Blue minimum + 0.98 x (blue range)

(e) The next step is to apply to the binary image obtained at 38d above a series of
morphological closing operations, which consist of a dilation operation followed by
an erosion operation. These morphological operations are to fuse narrow gaps and
eliminate small holes in individual groups of contiguous pixels appearing as "blobs"
in an image. It can be thought of as removal of irregularities or spatial "noise", and
it is a standard image processing procedure published in Umbaugh S.C., 'Colour
vision and image processing', Prentice Hall, 1998. The number of operations
applied depends on circumstances: operations (i.e. filling small gaps and/or holes)
terminate when the number of changes made by the erosion operation is greater
than the number made by the dilation operation.

(f) A connected component labelling process is now applied to the binary image
produced at 38e. This is a known image processing technique (sometimes referred
to as 'blob colouring') published by R Klette and P Zamperoni, 'Handbook of
Image Processing Operators', John Wiley & Sons, 1996, and A Rosenfeld and A C
Kak, 'Digital Picture Processing', Vols. 1 & 2, Academic Press, New York, 1982. It
gives numerical labels to "blobs" in the binary image, blobs being regions or groups
of like-valued contiguous or connected pixels in an image: i.e. each group or blob
consists of connected pixels which are all 1s and each is assigned a number
different to those of other groups. This enables individual blobs to be distinguished
from others by means of their labels. The number of labelled image regions or
blobs in the image is computed from the labels and output. Connected component
labelling also determines each labelled image region's centroid (pixel location of
region centre) and area, which are used later as will be described.

As indicated by arrows 42, the second and third sets of thresholding operations E and F
are carried out by iterating steps 38a to 38f above using factors E1/E2 and F1/F2
respectively.
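The adaptive threshold of step 38a can be sketched as below for any one of the factor pairs. This is a hedged illustration: the function names `direction_flags` and `threshold_red` are hypothetical, planes are assumed to be lists of rows, and the statistics are assumed to have been computed beforehand and passed in.

```python
def direction_flags(cyan, row, col, limit, reach=70):
    """Return one 0/1 flag per direction (north, south, east, west).

    A flag is 1 if any of up to `reach` cyan pixels in that direction is below `limit`.
    """
    n_rows, n_cols = len(cyan), len(cyan[0])
    flags = []
    for d_row, d_col in ((-1, 0), (1, 0), (0, 1), (0, -1)):
        flag = 0
        for step in range(1, reach + 1):
            r, c = row + d_row * step, col + d_col * step
            if not (0 <= r < n_rows and 0 <= c < n_cols):
                break  # fewer than `reach` pixels are available in this direction
            if cyan[r][c] < limit:
                flag = 1
                break
        flags.append(flag)
    return flags


def threshold_red(red, cyan, red_mean, red_std, cyan_mean, factor1=0.802, factor2=1.24):
    """Binary image: 1 where a red pixel is below its own adaptive threshold."""
    result = []
    for i, red_row in enumerate(red):
        out_row = []
        for j, value in enumerate(red_row):
            direction_sum = sum(direction_flags(cyan, i, j, factor2 * cyan_mean))
            threshold = min(255, factor1 * red_mean + (direction_sum - 4) * red_std)
            out_row.append(1 if value < threshold else 0)
        result.append(out_row)
    return result


cyan = [[100] * 3 for _ in range(3)]
red = [[65, 65, 55], [75, 75, 85], [0, 0, 0]]
mask = threshold_red(red, cyan, red_mean=100, red_std=10, cyan_mean=100)
```

The defaults correspond to D1 and D2; passing 0.903 for both factors, or 1.24 and 0.802, gives the E and F variants.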



(g) A technique referred to as the Downhill Simplex method is now applied to the
results of the thresholding operations obtained from 38a to 38f above. It is a
standard iterative statistical technique for multidimensional optimisation published
in Nelder J. A., Mead R., Computer Journal, vol. 7, pp 308-313, 1965. In the
present example the technique is used to determine which pair of factors D1/D2,
E1/E2 or F1/F2 used above yields the maximum number of image regions or
groups of connected pixels. It takes as input the three pairs of factors D1/D2, E1/E2
and F1/F2 together with the function returning the number of detected regions and
described at 38a and 38b above for adaptive thresholding.
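The connected component labelling of step 38f, which supplies the region count, areas and centroids used later, might look like the following sketch. It assumes 4-connectivity (the text does not say whether diagonal neighbours count as connected), and the function name `label_regions` is hypothetical.

```python
from collections import deque


def label_regions(binary):
    """Label 4-connected groups of 1s; return the label image and region records."""
    n_rows, n_cols = len(binary), len(binary[0])
    labels = [[0] * n_cols for _ in range(n_rows)]
    regions = []
    for i in range(n_rows):
        for j in range(n_cols):
            if binary[i][j] != 1 or labels[i][j]:
                continue
            label = len(regions) + 1
            labels[i][j] = label
            queue = deque([(i, j)])
            pixels = []
            while queue:  # breadth-first flood fill of one blob
                r, c = queue.popleft()
                pixels.append((r, c))
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    inside = 0 <= nr < n_rows and 0 <= nc < n_cols
                    if inside and binary[nr][nc] == 1 and not labels[nr][nc]:
                        labels[nr][nc] = label
                        queue.append((nr, nc))
            area = len(pixels)
            centroid = (
                sum(r for r, _ in pixels) / area,
                sum(c for _, c in pixels) / area,
            )
            regions.append({"label": label, "area": area, "centroid": centroid})
    return labels, regions


binary = [[1, 1, 0], [0, 0, 0], [0, 1, 1]]
labels, regions = label_regions(binary)
```

The number of regions is simply `len(regions)`, which is the quantity compared across the three factor pairs.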

The procedure is now concerned with determining quantities referred to as "grand mean"
and "mean range" to be defined later. If the Downhill Simplex method 38g has
determined that there are not more than a user specified number of image regions,
sixteen in the present example, then at 44 processing is switched to 46: here both grand
mean and mean range are set to 1.0 and processing passes to step 54 (to be described
later) to compute a required result.

If the Downhill Simplex method has determined that there are more than sixteen image
regions, then at 44 processing is switched to 48 where a search to characterise these
regions' boundaries is carried out. The search uses each region's area and centroid pixel
location as obtained in connected component labelling at 38f, and each region is
assumed to be a cell with a centroid which is the centre of the cell's nucleus. This
assumption is justified for most cells, but there may be misshapen cells for which it does
not hold: it is possible to discard misshapen cells by eliminating those with concave
boundary regions for example, but this is not implemented in the present example.

The search to characterise the regions' boundaries is carried out along the respective
north, south, east and west directions (as defined earlier) from the centroid: it is carried
out in each of these directions for a distance δ which is either 140 pixels or
$2\sqrt{\text{region area}}$, whichever is the lesser. It employs the cyan image because experience
shows that this image gives the best defined cell boundaries with the slide staining
previously described. Designating $C_{ij}$ as the intensity of a region's centroid pixel in the
cyan image at row i and column j, then pixels to be searched north, south, east and west
of this centroid will have intensities in the cyan image of $C_{i+1,j}$ to $C_{i+\delta,j}$, $C_{i-1,j}$ to $C_{i-\delta,j}$, $C_{i,j+1}$ to
$C_{i,j+\delta}$ and $C_{i,j-1}$ to $C_{i,j-\delta}$ respectively. The cyan intensity of each of the pixels to be searched
is subtracted from the centroid pixel's cyan intensity $C_{ij}$ to produce a difference value,
which may be positive or negative. In a cyan image, a cell nucleus is normally blue
whereas a boundary is brown (with staining as described earlier).

Each pixel is then treated as being part of four linear groups (or windows) of six, twelve,
twenty-four and forty-eight pixels each including the pixel and extending from it in a
continuous line north, south, east or west according respectively to whether the pixel is
north, south, east or west of the centroid. $C_{i+1,j}$ for example is grouped with $C_{i+2,j}$ to $C_{i+6,j}$,
$C_{i+2,j}$ to $C_{i+12,j}$, $C_{i+2,j}$ to $C_{i+24,j}$ and $C_{i+2,j}$ to $C_{i+48,j}$ (inclusive in each case). This provides a
total of 192 groups from 48 groups in each of four directions. For each group the
difference between each of its pixels' cyan intensities and that of the centroid is
calculated: the differences are summed over the group algebraically (positive and
negative differences cancelling one another) and divided by the number of pixels in the
group to provide a net difference per pixel between the cyan intensities of the group's
pixels and that of the centroid.

For each direction, i.e. north, south, east and west, there is now a respective set of 48
net differences per pixel; in each set the net differences per pixel are compared and their
maximum value is identified. This produces a respective maximum net difference per
pixel for each of the sets, i.e. for each of the north, south, east and west directions, and
size of window (number of pixels in group) in which the respective maximum occurred.

The four maxima so obtained (one for each direction) and the respective window size in
each case are stored. Each maximum is a measure of the region boundary (cell
membrane) magnitude in the relevant direction, because in a cyan image the maximum
difference as compared to a blue cell nucleus occurs at a brown boundary. The window
size associated with each maximum indicates the region boundary width, because a
boundary width will give a higher maximum in this technique with a window size which it
more nearly matches as compared to one it matches less well. Greater accuracy is
obtainable by using more window sizes. A further option is to record the position of the
maximum or boundary as being that of one of the two pixels at the centre of the window
in which the maximum occurs: this was not done in the present example, although it
would enable misshapen cells to be detected and discarded as being indicated by
significant differences in the positions of maxima in the four directions.
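The windowed boundary search along a single direction can be sketched as follows. The sketch makes several assumptions that the text leaves open: the input `profile` is the list of cyan intensities met when stepping outward from the centroid, every pixel of the profile may start a window, and windows that would run past the end of the profile are skipped. The function name `strongest_boundary` is hypothetical.

```python
def strongest_boundary(centroid_value, profile, window_sizes=(6, 12, 24, 48)):
    """Return (max net difference per pixel, window size, start index), or None.

    Each difference is the centroid intensity minus the pixel intensity, so a
    dark (brown) boundary next to a bright (blue) nucleus gives a positive value.
    """
    best = None
    for start in range(len(profile)):
        for size in window_sizes:
            window = profile[start:start + size]
            if len(window) < size:
                continue  # assumption: ignore windows extending past the search distance
            net = sum(centroid_value - value for value in window) / size
            if best is None or net > best[0]:
                best = (net, size, start)
    return best


# A six-pixel dark band (the "boundary") ten pixels out from the centroid.
profile = [200] * 10 + [100] * 6 + [200] * 10
result = strongest_boundary(200, profile)
```

In this example the six-pixel window matches the band exactly, which is why the smallest window size wins; a wider band would favour a larger window, in line with the use of window size as a width estimate.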

The next step 50 is to apply what is referred to as a "quicksort" to the four boundary
magnitudes to sort them into ascending order. Quicksort is a known technique published
by Klette R., Zamperoni P., 'Handbook of Image Processing Operators', John Wiley &
Sons, 1996, and will not be described. For each image region, measurements made as
described above are now recorded in a respective 1-dimensional vector as set out in
Table 1 below.



TABLE 1

Element number    Parameter
0                 North boundary magnitude
1                 South boundary magnitude
2                 East boundary magnitude
3                 West boundary magnitude
4                 Region area (from 38f)
5                 North boundary width
6                 South boundary width
7                 East boundary width
8                 West boundary width
9                 Sum of North, South, East and West boundary magnitudes



A further quicksort is now applied (also at 50) to the regions to sort them into ascending
order of element 9 values in Table 1 above, i.e. sum of north, south, east and west
boundary magnitudes. A subset of the regions is now selected as being those having
large values of element 9; these most significant regions are typically the top one eighth
of the subset of regions in terms of element 9 magnitude. From this subset of regions the
following parameters are computed at 52, "grand mean", "mean range" and "relative
range" as defined below:

octile = one eighth of the number of regions in the subset    (12)
boundaries = boundary magnitudes
Σ = sum of (over all regions in the subset)
element 1 = south boundary magnitude
element 3 = west boundary magnitude

$$\text{grand mean} = \frac{(\Sigma\,\text{east boundaries}) + (\Sigma\,\text{west boundaries}) + (\Sigma\,\text{north boundaries}) + (\Sigma\,\text{south boundaries})}{4 \times \text{octile}} \tag{13}$$

$$\text{mean range} = \frac{(\Sigma\,\text{boundary element 3}) - (\Sigma\,\text{boundary element 1})}{\text{octile}} \tag{14}$$

$$\text{relative range} = \frac{10 \times \text{mean range}}{\text{grand mean}} \tag{15}$$
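The three summary quantities could be computed from the Table 1 vectors roughly as follows. This sketch rests on stated assumptions: `vectors` holds one ten-element Table 1 vector per region, elements 0 to 3 are already in ascending order after the first quicksort, the selected subset is the top eighth by element 9, "octile" is taken to be the number of regions in that subset, and the factor of 10 in the relative range is exposed as the parameter `scale`. The function name `summary_measures` is hypothetical.

```python
def summary_measures(vectors, scale=10.0):
    """Return (grand mean, mean range, relative range) for the strongest regions."""
    ranked = sorted(vectors, key=lambda v: v[9])  # ascending element 9
    octile = max(1, len(ranked) // 8)  # assumption: size of the selected subset
    subset = ranked[-octile:]  # regions with the largest element 9
    total = sum(v[0] + v[1] + v[2] + v[3] for v in subset)
    grand_mean = total / (4 * octile)
    mean_range = (sum(v[3] for v in subset) - sum(v[1] for v in subset)) / octile
    relative_range = scale * mean_range / grand_mean if grand_mean else 0.0
    return grand_mean, mean_range, relative_range


def make_vector(magnitudes, area=50, width=6):
    """Build an illustrative Table 1 vector from four boundary magnitudes."""
    ordered = sorted(magnitudes)
    return ordered + [area, width, width, width, width, sum(ordered)]


vectors = [make_vector([1, 1, 1, 1]) for _ in range(7)] + [make_vector([8, 2, 6, 4])]
measures = summary_measures(vectors)
```

With eight regions the subset contains only the strongest one, so the measures reduce to that region's own mean and range.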



At 54 an overall distance measure is computed: this measure provides an estimate of
how far the current cyan image (used to derive grand mean etc. above) is from each
member of a predetermined standard set of images, four images in the present example.
In this example the distance measure is computed against a set of four predetermined
standard images: the standard images were obtained by dividing a large test dataset of
images into four different image types corresponding respectively to four different Cerb-B2
status indicators (as will be described later in more detail). The images of each
image type were analysed to determine grand mean and relative range for each image
as described above. A respective average grand mean $M_i$ (i = 0, 1, 2 and 3) and a
respective average relative range $skew_i$ were determined for the images of each of the
four image types. As an alternative, it is also possible to select four good quality images
of the relevant types by inspection from many images, and to determine $M_i$ and $skew_i$
from them. The values $M_i$ and $skew_i$ become the components of respective four-element
vectors M and skew, and are used in the following expression:

$$\text{Cerb-B2 indicator} = \min_i \left\{ (M_i - \text{grand mean})^2 + (skew_i - \text{relative range})^2 \right\} \tag{16}$$

where $\min_i$ is the value of i (i = 0, 1, 2 or 3) for which the expression within curved
brackets { } on the right of Equation (16) is a minimum. For the vector M, from the
dataset the following elements were determined: $M_0$ = 12J2, $M_1$ = 23.16, $M_2$ = 42.34 and
$M_3$ = 87.35; elements determined likewise for the vector skew were $skew_0$ = 2.501,
$skew_1$ = 1.as, $skew_2$ = 1.111 and $skew_3$ = 0.3394. The value of i is returned as the
indicator for the Cerb-B2 measurement process.
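Equation (16) amounts to choosing the nearest of four reference points in the (grand mean, relative range) plane, which can be sketched as below. The function name `cerb_b2_indicator` is hypothetical, and the reference values in the usage lines are made-up placeholders for illustration, not the values determined from the dataset.

```python
def cerb_b2_indicator(grand_mean, relative_range, reference_means, reference_skews):
    """Return the index i whose reference pair is closest in squared distance."""
    distances = [
        (mean - grand_mean) ** 2 + (skew - relative_range) ** 2
        for mean, skew in zip(reference_means, reference_skews)
    ]
    return distances.index(min(distances))


# Placeholder reference vectors (hypothetical values for illustration only).
example_means = [10.0, 20.0, 40.0, 80.0]
example_skews = [2.5, 1.5, 1.0, 0.5]
indicator = cerb_b2_indicator(38.0, 1.2, example_means, example_skews)
```

Because the grand mean spans a much larger numeric range than the relative range, it tends to dominate the squared distance.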

If a value of i = 3 is obtained in the Cerb-B2 measurement process, this is regarded as a
strongly positive result: the patient from whom the original tissue samples were taken is
regarded as highly suitable for treatment, currently with Herceptin. A value of i = 2 is
weakly positive indicating doubtful suitability for treatment, and i = 1 or 0 is a negative
result indicating unsuitability. This is tabulated below in Table 2.

TABLE 2

Cerb-B2 status       i Value
Strongly positive    3
Weakly positive      2
Negative             0, 1




Referring now to Figure 3, there is shown a flow diagram of the process 14 (see Figure
1) for measurement of vascularity. The process 14 is applied to two images each of x40
magnification compared to the histopathological slide from which they were taken. At 60
each image is transformed from red/green/blue (RGB) to a different image space
hue/saturation/value (HSV). The RGB to HSV transformation is described by K. Jack in
'Video Demystified', 2nd ed., HighText Publications, San Diego, 1996. In practice value V
(or brightness) is liable to vary due to staining and thickness variations across a slide, as
well as possible vignetting by a camera lens used to produce the images. In
consequence in this example the V component is ignored: it is not calculated, and
emphasis is placed on the hue (or colour) and saturation values H and S. H and S are
calculated for each pixel of the two RGB images as follows:



Let M = maximum of (R,G,B)          (17)
Let m = minimum of (R,G,B)          (18)
Then newr = (M - R)/(M - m)         (19)
newg = (M - G)/(M - m) and          (20)
newb = (M - B)/(M - m)              (21)

This converts each colour of a pixel into the difference between its magnitude and that of
the maximum of the three colour magnitudes of that pixel, this difference being divided
by the difference between the maximum and minimum of (R,G,B).

Saturation (S) is set as follows:

If M equals zero, then S = 0                              (22)
If M does not equal zero, then S = (M - m)/M              (23)

The calculation for Hue (H) is as follows: from Equation (17) M must be equal to at least
one of R, G and B:

If M equals zero, then H = 180                            (24)
If M equals R then H = 60(newb - newg)                    (25)
If M equals G then H = 60(2 + newr - newb)                (26)
If M equals B then H = 60(4 + newg - newr)                (27)
If H is greater than or equal to 360 then H = H - 360     (28)
If H is less than 0 then H = H + 360                      (29)

The Value V is not used in this example, but were it to be used it would be set to the
maximum of (R,G,B).
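Equations (17) to (29) translate fairly directly into code, as in the sketch below. Two points are assumptions rather than statements from the text: a pixel with M equal to m (any grey, including black) is given H = 180 to avoid dividing by zero, and where two channels tie for the maximum the first matching rule in the order R, G, B is used. The function name `hue_saturation` is hypothetical.

```python
def hue_saturation(r, g, b):
    """Return (hue in degrees from 0 up to 360, saturation from 0 to 1) for one pixel."""
    big, small = max(r, g, b), min(r, g, b)
    saturation = 0.0 if big == 0 else (big - small) / big
    if big == small:
        # Assumption: achromatic pixels are assigned H = 180, as for M = 0.
        return 180.0, saturation
    span = big - small
    newr, newg, newb = (big - r) / span, (big - g) / span, (big - b) / span
    if big == r:
        hue = 60 * (newb - newg)
    elif big == g:
        hue = 60 * (2 + newr - newb)
    else:
        hue = 60 * (4 + newg - newr)
    if hue >= 360:
        hue -= 360
    if hue < 0:
        hue += 360
    return hue, saturation


magenta = hue_saturation(255, 0, 255)
```

The value V is deliberately not returned, matching the decision to ignore brightness.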

The next step 62 is to apply colour segmentation to obtain a binary image. This
segmentation is based on thresholding using the Hue and Saturation from the HSV
colour space, and is shown in Table 3 below.

TABLE 3

Threshold Criterion                                        Binary Image Pixel Value

Pixel with both Hue H in the range 282 - 356 degrees       Set pixel to 1
(scale 0 to 360), and Saturation S in the range 0.2 to
0.24 (scale 0 to 1)

Pixel with either Hue outside the range 282 - 356          Set pixel to 0
degrees, and/or Saturation outside the range 0.2 - 0.24

This produces a segmented binary image.
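The Table 3 criterion can be sketched as a per-pixel test over hue and saturation planes. Treating the range endpoints as inclusive is an assumption, and the function name `segment` is hypothetical.

```python
def segment(hue_plane, saturation_plane,
            hue_range=(282.0, 356.0), saturation_range=(0.2, 0.24)):
    """Return a binary image: 1 where both hue and saturation fall in range."""
    return [
        [
            1
            if hue_range[0] <= hue <= hue_range[1]
            and saturation_range[0] <= saturation <= saturation_range[1]
            else 0
            for hue, saturation in zip(hue_row, saturation_row)
        ]
        for hue_row, saturation_row in zip(hue_plane, saturation_plane)
    ]


binary = segment([[300.0, 300.0, 100.0]], [[0.22, 0.5, 0.22]])
```

Only the first pixel passes: the second fails on saturation and the third on hue.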



The next stage 64 is to apply connected component labelling (as defined previously) to
the segmented binary image: this provides a binary image with regions of contiguous
pixels equal to 1, the regions being uniquely labelled for further processing and their
areas being determined. The labelled binary image is then spatially filtered to remove
small connected components (image regions with less than 10 pixels), which provides a
reduced binary image.

The sum of the area of the remaining image regions in the reduced binary image is then
determined at 66 from the results of connected component labelling, and this sum is then
expressed as a percentage of the area of the whole image. This procedure is carried out
for both of the original RGB images separately, to provide two such percentage area
values: the average of the two percentage area values is computed, and it represents an
estimate of the percentage of the area of a tissue sample occupied by blood vessels -
i.e. the sample vascularity.

As set out in Table 4 below, vascularity is determined to be high or low depending on
whether or not it is equal to at least 31%.

TABLE 4

Description of vascularity    Range
High                          31% - 100%
Low                           Below 31%


High vascularity corresponds to relatively fast tumour growth because tumour blood
supply has been facilitated, and early treatment is indicated. Low vascularity corresponds
to relatively slow tumour growth, and early treatment is less important.

Referring now to Figure 4, there is shown a flow diagram of a process 16 for the
measurement of Oestrogen (ER) and Progesterone (PR). The process for measuring
these two is the same except for their use of different immunohistochemical staining
processes applied to tissue samples (slide colouring). The process will be described for
the case of Oestrogen receptor assessment with diaminobenzidine as staining agent.

A digitised image of a histopathological slide or part of such a slide can be
characterised for its response to oestrogen, for any immunohistochemical stain used. A
user can adapt the process to be described below by defining an appropriate reference
colour, which must represent a sample with maximum intake of oestrogen. There are two
main colours in the pixels of ER medical images, i.e. brown and blue. The same colours
appear in PR imagery, and so the image processing technique is the same. In this
example the process determines brown pixels in an image that are similar to a reference
brown colour defined by a user of the process.

Referring now to Figure 4, processing 16 to determine ER status will be outlined and
then described in more detail later. It begins with a pre-processing stage 70 in which a
K-means clustering algorithm is applied to a colour image using a Mahalanobis metric. This
determines or cues image regions of interest for further processing by associating pixels
into clusters on the basis of their having similar values of the Mahalanobis metric. At 72
the colour image is transformed into a chromaticity space which includes a location of a
reference colour. Hue and saturation are calculated at 74 for pixels in clusters cued by
K-means clustering. The number of brown stained pixels is computed at 76 by thresholding
on the basis of hue and saturation. An ER status measurement is then derived at 78 from
a combination of the fraction of stained pixels and average colour saturation.

The input for the ER preprocessing stage 70 consists of raw digital data files of a single
colour histopathological image or tile. A triplet of image band values for each pixel
represents the colour of that pixel in its red, green, and blue spectral components. These
values in each of the three image bands are in the range [0...255], where [0,0,0]
corresponds to black and [255,255,255] corresponds to white.

The K-means clustering algorithm 70 is applied to the digital colour image using four
clusters and the Mahalanobis distance metric. A cluster is a natural grouping of data having
similar values of the relevant metric, and the Mahalanobis distance metric is a
measurement that gives an indication of degree of closeness of data items to a cluster
centre. It is not essential to use four clusters or the Mahalanobis distance metric, but
these provide an adequate subdivision of the data and a convenient metric. The K-means
algorithm is described by J. A. Hartigan and M. A. Wong, in a paper entitled 'A K-means
clustering algorithm', Algorithm AS 136, in the Applied Statistics Journal, 1979.
The Mahalanobis distance metric is described by F. Heijden, in 'Image Based
Measurement Systems - object recognition and parameter estimation', John Wiley &
Sons, 1994 and by R. Schalkoff, in 'Pattern Recognition - Statistical, Structural and
Neural approaches', John Wiley & Sons Inc., 1992. The process comprises an
initialisation step a) followed by computation of a covariance matrix at step b). This leads
to a likelihood calculation at step c), which effectively provides the distance of a pixel
from a cluster centre. The procedure is as follows:

a) Initially, cluster centres are set at random locations and approximately equal
numbers of pixels are arbitrarily assigned to each cluster for later readjustment.

b) For each cluster the following computations are carried out: 

i) Compute elements $C^k_{ij}$ of a covariance matrix of the image bands indicating
the degree of variation between intensities of different colours in pixels of each cluster $k$:

$$C^k_{ij} = \frac{1}{N_k} \sum_{l=1}^{N_k} \left(x_{li} - \mu^k_i\right)\left(x_{lj} - \mu^k_j\right) \qquad (30)$$

where: $C^k_{ij}$ is the $ij$th element of the covariance matrix,

$N_k$ is the number of pixels in cluster $k$,

$x_{li}$ and $x_{lj}$ are the values of pixel $l$ in image bands $i$ and $j$,

$i$, $j$ take values 1, 2, 3, which represent the red, green and blue image bands
respectively,

$\mu^k_i$ is the mean of all pixels in image band $i$ belonging to cluster $k$, and

$\mu^k_j$ is the mean of all pixels in image band $j$ belonging to cluster $k$.

ii) Calculate the determinant of the covariance matrix, denoted as $\det C^k$.

iii) Calculate the inverse of the covariance matrix, denoted as $(C^k)^{-1}$.
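Step b) can be sketched in plain Python as below. This is an illustration under stated assumptions rather than the patented code: the covariance is normalised by the number of pixels in the cluster, pixels are (red, green, blue) tuples, and the function names are hypothetical.

```python
def covariance(pixels):
    """Return (mean, 3x3 covariance) of a cluster's pixels.

    Assumes normalisation by the pixel count N, as in Equation (30).
    """
    n = len(pixels)
    mean = [sum(p[i] for p in pixels) / n for i in range(3)]
    cov = [[sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in pixels) / n
            for j in range(3)]
           for i in range(3)]
    return mean, cov


def determinant3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def inverse3(m):
    """Inverse of a 3x3 matrix as adjugate divided by determinant."""
    det = determinant3(m)
    if det == 0:
        raise ValueError("covariance matrix is singular")
    cof = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            rows = [r for r in range(3) if r != i]
            cols = [c for c in range(3) if c != j]
            minor = (m[rows[0]][cols[0]] * m[rows[1]][cols[1]]
                     - m[rows[0]][cols[1]] * m[rows[1]][cols[0]])
            cof[i][j] = (-1) ** (i + j) * minor
    # The adjugate is the transpose of the cofactor matrix.
    return [[cof[j][i] / det for j in range(3)] for i in range(3)]
```

A cluster whose pixels all share one colour has a singular covariance matrix; how the original process handles that case is not stated, so the sketch simply raises an error.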

c) With $l$ denoting pixel number, each pixel $\mathbf{x}_l$ is now treated as a vector having three
elements $x_{l1}$, $x_{l2}$, $x_{l3}$, which are the red ($x_{l1}$), green ($x_{l2}$) and blue ($x_{l3}$) pixel values:
the red, green and blue image bands are therefore represented by second subscript
indices 1, 2 and 3 respectively. With $l$ ranging over all pixels in a cluster $k$, the likelihood
$d^k(\mathbf{x}_l)$ of a pixel vector $\mathbf{x}_l$ not belonging to that cluster is computed from Equation (31)
below:



where the determinant and inverse of the covariance matrix are as defined above,

$\boldsymbol{\mu}^k$ is the mean of all pixel vectors in cluster $k$, and

$t$ indicates the transpose of the difference vector $(\mathbf{x}_l - \boldsymbol{\mu}^k)$.

Equation (31) is re-evaluated for the same pixel vector $\mathbf{x}_l$ in all other clusters also. Pixel
vector $\mathbf{x}_l$ has the highest likelihood of belonging to a cluster (denoted $k_m$) for which
$d^k(\mathbf{x}_l)$ has a minimum value, i.e. $\min_k\{d^k(\mathbf{x}_l)\}$; cluster $k_m$ is then the most suitable to receive
pixel $\mathbf{x}_l$; i.e. find:

$$d^{k_m}(\mathbf{x}_l) < d^k(\mathbf{x}_l) \quad \text{for all } k \neq k_m \qquad (32)$$

Assign pixel $l$ to cluster $k_m$.
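The assignment rule of Equation (32) can be sketched as below. The exact form of Equation (31) is not reproduced here, so the sketch assumes a commonly used Gaussian-style distance, half the log-determinant plus half the Mahalanobis quadratic form; that choice, and the names `distance` and `best_cluster`, are assumptions and not taken from the source.

```python
import math


def distance(x, mean, inv_cov, det_cov):
    """Assumed distance: 0.5*ln(det C) + 0.5*(x - mu)^t C^-1 (x - mu)."""
    diff = [x[i] - mean[i] for i in range(3)]
    quad = sum(diff[i] * inv_cov[i][j] * diff[j]
               for i in range(3) for j in range(3))
    return 0.5 * math.log(det_cov) + 0.5 * quad


def best_cluster(x, clusters):
    """Return the index of the cluster with the minimum distance to x.

    clusters is a list of (mean, inverse covariance, determinant) triples.
    """
    scores = [distance(x, m, ic, d) for (m, ic, d) in clusters]
    return min(range(len(scores)), key=scores.__getitem__)
```

Only the selection of the minimum over clusters is taken directly from the text; any distance with the same inputs could be substituted for `distance`.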

d) For each cluster $k$:

Store a record of which pixels belong to cluster $k$ as an array; update it with each pixel
vector assigned to that cluster and update the number of pixels $N_k$ in that cluster.

Calculate the cluster centre $\mu^k_j$ for each image band $j$ = 1, 2 and 3:

$$\mu^k_j = \frac{1}{N_k} \sum_{l=1}^{N_k} x_{lj} \qquad (33)$$

Iterate steps b) to d) until convergence, i.e. until no more pixels change clusters, or until
the number of iterations reaches a total of 20.
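The centre update of step d) and the stopping rule can be sketched as follows. The reassignment step is passed in as a function so that the sketch stays short; `reassign` is a hypothetical callback standing for steps b) and c), and the `nearest_centre` function used in the demonstration is a plain Euclidean stand-in, not the Mahalanobis-based rule described in the text.

```python
def cluster_centres(pixels, labels, k):
    """Per-band mean of the pixels assigned to each cluster (Equation (33))."""
    centres = []
    for cluster in range(k):
        members = [p for p, lab in zip(pixels, labels) if lab == cluster]
        if not members:
            centres.append(None)  # assumption: an empty cluster has no centre
            continue
        centres.append(tuple(sum(p[j] for p in members) / len(members)
                             for j in range(3)))
    return centres


def iterate(pixels, labels, k, reassign, max_iterations=20):
    """Repeat reassignment until no label changes or max_iterations is reached."""
    for iteration in range(1, max_iterations + 1):
        centres = cluster_centres(pixels, labels, k)
        new_labels = [reassign(p, centres) for p in pixels]
        if new_labels == labels:
            return labels, iteration
        labels = new_labels
    return labels, max_iterations


def nearest_centre(p, centres):
    """Euclidean stand-in for the reassignment rule (illustration only)."""
    best, best_d = 0, None
    for idx, c in enumerate(centres):
        if c is None:
            continue
        d = sum((a - b) ** 2 for a, b in zip(p, c))
        if best_d is None or d < best_d:
            best, best_d = idx, d
    return best


pixels = [(0, 0, 0), (1, 1, 1), (10, 10, 10), (11, 11, 11)]
final_labels, iterations = iterate(pixels, [0, 1, 0, 1], 2, nearest_centre)
```

Starting from a deliberately poor assignment, the demonstration settles after two passes with the two dark pixels in one cluster and the two bright pixels in the other.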

The first cluster (k = 1) now corresponds to cell nuclei and the corresponding pixel
vectors are those which are cued as of interest for output and further processing.

Transformation of the image at 72 from red/green/blue (RGB) to hue/saturation/value
(HSV) is as described by K. Jack in 'Video Demystified', 2nd ed., HighText Publications,
San Diego, 1996. In practice value V (or brightness) is liable to vary due to staining and
thickness variations across a slide, as well as possible vignetting by a camera lens used
to produce the images. In consequence in this example the V component is ignored: it is
not calculated, and emphasis is placed on the hue (or colour) and saturation values H
and S. H and S are calculated for each pixel of the RGB image as follows:



(a) Referring now also to Figures 5 to 8, each RGB image is transformed into a
    chromaticity space. Figure 5 shows an RGB cube 100 in which red, green and
    blue pixel values (expressed as R, G and B respectively) are normalised and
    represented as values in the range 0 to 1. These pixel values are represented
    on red, green and blue axes 102, 104 and 106 respectively. The chromaticity
    space is a plane 108 for which R+G+B = 1: it is triangular within the RGB cube
    100 and passes through the points (1,0,0), (0,1,0) and (0,0,1).

(b) Figure 6 shows the axes 102, 104 and 106 and chromaticity space 108 looking
    broadly speaking along a diagonal of the RGB cube 100 from the point (1,1,1)
    (not shown) to the origin (0,0,0), now referenced O for convenience. The points
    (0,0,1), (0,1,0) and (1,0,0) in Figure 5 are now referenced J, K and L
    respectively. D is a midpoint of a straight line between J and K. Image pixel
    values from the input RGB image are projected on to the chromaticity space
    108 and the resulting projections become data points for further processing.
    The projection calculation is as follows:

    Red, green and blue pixel chromaticity values r, g and b respectively are
    defined as:

    $$r = \frac{R}{R+G+B}, \quad g = \frac{G}{R+G+B}, \quad b = \frac{B}{R+G+B} \qquad (34)$$

    Perpendiculars from a point P in the chromaticity space 108 to the lines JK and
    LD meet the latter at E and G respectively. Perpendiculars from P and G to the
    plane JOK meet the latter at F and H respectively. Using Equations (34), the
    point P in the triangular chromaticity space 108 may then be defined by x and
    y co-ordinates shown in Figure 6 and given by:

x^DE=^HF:^^~' and y^PE^GD^b^ (35) 

(c) In Figure 7, the chromaticity space 108 is shown with x and y co-ordinate axes
    extending from an origin Q. A reference colour denoted by a point S in the
    drawing is now defined as that specified for this purpose by a clinician: it is the
    colour of that part of the image which is most positively stained (the most
    intense colour on the part of the original slide from which the image was
    taken). The reference colour's RGB components are taken from the image and
    its x and y co-ordinates are computed using Equations (34) and (35).

(d) In Figure 8, a polar co-ordinate system $(r, \theta)$ is now defined on the (R+G+B=1)
    plane 108. The co-ordinate system origin is the centre of gravity Q of the
    triangle 109. A reference direction for $\theta = 0$ is defined as the direction QS of
    the radius vector to the reference colour S in Figure 7. For any point such as P
    on the triangle defined as having co-ordinates $(x, y)$ in the HSV colour space,
    hue H is defined as the angle $\phi$ between the radius vector (e.g. QP) to itself
    and the reference colour direction QS. This is computed from the following
    expressions for $\phi$:

- JL Jiry — ;nr . . . :ar-hyy 



and the angle jJIs defined to be sto,"^ . 
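The projection on to the chromaticity plane used in steps (a) and (b) can be sketched as below. It covers only the normalisation that places a pixel on the plane R+G+B = 1; the later x, y and hue calculations are not attempted. The function name and the handling of a pure black pixel (returned as the centre of the triangle) are assumptions.

```python
def chromaticity(R, G, B):
    """Return (r, g, b) with r + g + b = 1 for an RGB pixel."""
    total = R + G + B
    if total == 0:
        # Assumption: black has no defined chromaticity, so use the
        # centre of the triangle rather than dividing by zero.
        return (1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0)
    return (R / total, G / total, B / total)
```

Because only ratios are kept, scaling all three bands by the same factor leaves the result unchanged, which is consistent with ignoring brightness.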

For convenience the definition of hue H is now altered somewhat to render all values
positive and in the range 0 to $\pi/2$: the transformation of earlier values $\phi$ into a new
version $\psi$ is shown in Table 5 below:

TABLE 5

Condition                                Magnitude of $\psi$ (New Hue H)

$\sin\phi > 0$ and $\cos\phi > 0$        $\phi$

$\sin\phi > 0$ and $\cos\phi < 0$        $\pi - \phi$

$\sin\phi < 0$ and $\cos\phi > 0$        $-\phi$

$\sin\phi < 0$ and $\cos\phi < 0$        $\phi - \pi$



A hue (H) threshold is set by a user or programmer of the procedure as being not
more than $\pi/2$, a typical value which might be chosen being 80 degrees. Saturation S is
defined to be

saturation - ^ — =^ 

Two values of saturation threshold $S_0$ are set according to whether or not image pixel
saturation value S lies in the range 0.1 to 1.9: this is set out in Table 6 below:

TABLE 6

Saturation S                       $S_0$

Either S < 0.1 or S > 1.9          0

0.1 ≤ S ≤ 1.9                      0.9



At 76, the thresholds are used to count selectively the number $N_b$ of pixels which are
sufficiently brown (having a large enough value of saturation) having regard to the
reference colour. All H and S pixel values in the image are assessed. The conditions to
be satisfied by a pixel's hue and saturation values for it to be counted in the brown pixel
number $N_b$ are set out in Table 7 below.

TABLE 7

Condition                                      Action

For each pixel with both hue modulus           Treat as a "saturated" pixel;
$|\psi|$ below the hue threshold and           increase count of brown
saturation $S > S_0$                           pixels by 1

For each pixel with hue modulus $|\psi|$       Treat as an "unsaturated"
above the hue threshold and/or                 pixel; leave count unchanged
saturation $S < S_0$
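The counting rule of Tables 6 and 7 can be sketched as below, reading the tables literally. The function names, the (hue, saturation) pair layout and the use of degrees for hue are assumptions; the default hue threshold of 80 degrees is the typical value mentioned in the text.

```python
def saturation_threshold(s):
    """Table 6 read literally: 0.9 when 0.1 <= S <= 1.9, otherwise 0."""
    return 0.9 if 0.1 <= s <= 1.9 else 0.0


def count_brown(pixels, hue_threshold_deg=80.0):
    """Count pixels passing both the hue and saturation tests of Table 7.

    pixels is an iterable of (hue in degrees, saturation) pairs. Returns
    the brown pixel count and the saturation values of the counted pixels.
    """
    saturated = [s for (h, s) in pixels
                 if abs(h) < hue_threshold_deg and s > saturation_threshold(s)]
    return len(saturated), saturated
```

Read literally, a pixel whose saturation lies outside 0.1 to 1.9 is compared against a threshold of 0, so the sketch follows the tables as written rather than any intended interpretation.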



The average saturation of the $N_b$ saturated pixels determined in Table 7 is computed by
adding all their saturation values S together and dividing the resulting sum by $N_b$. The
maximum saturation value of the saturated pixels is then determined, and the average
saturation is expressed as a percentage of this maximum: the average saturation is then
accorded a score at 78 of 0, 1, 2 or 3 according respectively to whether this percentage
is (a) ≤ 25%, (b) > 25% and ≤ 50%, (c) > 50% and ≤ 75% or (d) > 75% and ≤ 100%.
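The saturation score can be sketched as follows. The function name is hypothetical, and returning 0 when there are no saturated pixels is an assumption, since the text does not cover that case.

```python
def saturation_score(saturations):
    """Score 0 to 3 from the average saturation as a percentage of the maximum."""
    if not saturations:
        return 0  # assumption: no saturated pixels gives the lowest score
    peak = max(saturations)
    if peak <= 0:
        return 0
    percent = 100.0 * (sum(saturations) / len(saturations)) / peak
    if percent <= 25:
        return 0
    if percent <= 50:
        return 1
    if percent <= 75:
        return 2
    return 3
```

For instance, saturations of 0.5 and 1.0 average to 75% of the maximum, which falls in band (c) and scores 2.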

The fraction of saturated pixels - those stained sufficiently brown - is computed at 78
from the ratio $N_b/N$, where N is the total number of pixels in the image. This fraction is
then quantised to a score in the range 0 to 5 as set out in Table 8 below.

TABLE 8

$N_b/N$: Fraction of image          Score
pixels that are stained

0.00                                0

< 0.01                              1

0.01 - 0.10                         2

0.11 - 0.33                         3

0.34 - 0.66                         4

0.67 - 1.00                         5
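The quantisation of Table 8 can be sketched as below. The table leaves small gaps between bands (for example between 0.10 and 0.11); the sketch assumes each band extends up to and including its stated upper limit, and the function name is hypothetical.

```python
def fraction_score(n_brown, n_total):
    """Score 0 to 5 from the fraction of stained pixels (Table 8)."""
    fraction = n_brown / n_total
    if fraction == 0:
        return 0
    if fraction < 0.01:
        return 1
    if fraction <= 0.10:
        return 2
    if fraction <= 0.33:
        return 3
    if fraction <= 0.66:
        return 4
    return 5
```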



The two scores determined above, i.e. for average colour saturation and fraction of
sufficiently brown pixels, are now added together to give a measure in the range 0 to 8.
The higher this number is, the more oestrogen (ER) positive the sample is.

Description of ER status (ER score)      Range

Strongly positive                        7-8

Positive                                 4-6

Weakly positive                          2-3

Negative                                 0-1
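Combining the two scores into an ER status can be sketched as follows; the function name and the returned tuple layout are assumptions made for illustration.

```python
def er_status(saturation_score, fraction_score):
    """Add the two scores (0 to 3 and 0 to 5) and map the total to a status."""
    total = saturation_score + fraction_score
    if total >= 7:
        status = "Strongly positive"
    elif total >= 4:
        status = "Positive"
    elif total >= 2:
        status = "Weakly positive"
    else:
        status = "Negative"
    return total, status
```

A saturation score of 3 with a fraction score of 4 gives a total of 7, which falls in the strongly positive band.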



Women with an ER score of 7 or 8 will respond favourably to hormonal treatment such
as Tamoxifen; women with an ER score in the range 4 to 6 will have a 50% chance of
responding to this treatment. Women scoring 2 or 3 will not respond very well, and those
scoring 0 or 1 will not respond to hormonal treatment at all.

Images for ER and PR are indistinguishable visually and they are distinguished by the
fact that they are produced using different stains. A PR score is therefore produced from
stained slides in the same way as an ER score described above. The significance of
progesterone receptor (PR) positivity in a breast carcinoma is less well understood than
the equivalent for ER. In general, cancers that are ER positive will also be PR positive.
However, carcinomas that are PR positive, but not ER positive, may have a worse
prognosis.

The process steps described in the examples of all three inventions described herein are
not all essential and alternatives may be provided. It is for example possible to omit a
step of ignoring unsuitably small areas or of selecting areas for later processing, if the
consequent increase in processing burden is acceptable. The above examples are
intended to provide an enabling disclosure, not to limit the invention.

[Flow diagram: apply Sobel filter to cyan image; compute mean and standard deviation on
Sobel filter output; colour segmentation process; adaptive thresholding; boundary
analysis; compute grand mean and mean range from boundary analysis]

Figure 3 Vascularity process
[Flow diagram: apply colour segmentation using the hue and saturation image planes;
connected component labelling and reject unsuitably small regions (64); measure area of
remaining regions]

Figure 4 ER & PR measurement
[Flow diagram: apply K-means clustering using Mahalanobis metric (70); transform image
into chromaticity space having reference colour; compute hue and saturation for cued
points of interest; ... pixels by thresholding from hue and saturation (76); ER or PR
measurement based on fraction of stained pixels and average colour saturation]